National Repository of Grey Literature 10 records found
Convolutional Networks for Lip Reading
Kadleček, Josef ; Kišš, Martin (referee) ; Hradiš, Michal (advisor)
This thesis deals with current methods for automatic speech recognition and lip reading using neural networks. It also examines similarities between neural network architectures for audio and visual data, and surveys the available datasets for audiovisual automatic speech recognition. The main contribution of this thesis is a set of experiments comparing different changes to the neural network architecture and their impact on the results. The thesis includes an implementation of a system for automatic speech recognition from audio (CER: 12.6 %) and visual (CER: 57.7 %) data. The architectures of both systems are based on feature extraction via convolutional networks, followed by recurrent LSTM layers, another layer of convolutions, and the CTC loss function.
Deep Networks for Handwriting Recognition
Richtarik, Lukáš ; Herout, Adam (referee) ; Hradiš, Michal (advisor)
This work deals with handwritten text recognition using deep neural networks. It focuses on the sequence-to-sequence method with an encoder-decoder model. It also includes the design of an encoder-decoder model for handwritten text recognition that uses a transformer instead of recurrent neurons, and a set of experiments performed on it.
Chatbot Based on Artificial Neural Networks
Richtarik, Lukáš ; Beneš, Karel (referee) ; Szőke, Igor (advisor)
This work deals with chatbots based on artificial neural networks and generative models. It also describes the options for and the process of designing a chatbot, as well as its implementation and testing using the BLEU metric. The work contains multiple experiments with different chatbot models, their performance evaluation and comparison, user experience, and several suggestions for future enhancements.
Improving Consistency in Text Recognition Datasets
Tvarožný, Matúš ; Hradiš, Michal (referee) ; Kišš, Martin (advisor)
This work is concerned with increasing the consistency of datasets for text recognition. It describes the problems that cause inconsistency and then presents solutions to eliminate them. It investigates the effect of the properties of the polygons defining the text line boundaries, and how a modified version of the dataset composed of ideal text line variants affects the accuracy of the model. Further, the work focuses on detecting and then removing or modifying text lines whose ground truth transcription does not match the actual text they contain. Experiments showed that removing the visual inconsistency from the training set did not have a significant effect on the trained model, but modifying the test set improved the OCR accuracy of the model by 1.1% CER. Modifying the dataset so that it did not contain mutually inconsistent pairs of recognized text and the corresponding ground truth improved the model by at most 0.2% CER after retraining. The main finding of this work is the demonstrated benefit of removing inconsistencies from test sets, which makes it possible to determine a more realistic error rate of the OCR model.
Machine Translation Using Artificial Neural Networks
Holcner, Jonáš ; Beneš, Karel (referee) ; Szőke, Igor (advisor)
The goal of this thesis is to describe and build a system for neural machine translation. The system is built with recurrent neural networks, in particular an encoder-decoder architecture. The result is an nmt library used to conduct experiments with different model parameters. The results of the experiments are compared with a system built with the statistical tool Moses.
